    tsdownsample: high-performance time series downsampling for scalable visualization

    Interactive line chart visualizations greatly enhance the effective exploration of large time series. Although downsampling has emerged as a well-established approach to enable efficient interactive visualization of large datasets, it is not an inherent feature in most visualization tools. Furthermore, there is no library offering a convenient interface for high-performance implementations of prominent downsampling algorithms. To address these shortcomings, we present tsdownsample, an open-source Python package specifically designed for CPU-based, in-memory time series downsampling. Our library focuses on performance and convenient integration, offering optimized implementations of leading downsampling algorithms. We achieve this optimization by leveraging low-level SIMD instructions and multithreading capabilities in Rust. In particular, SIMD instructions were employed to optimize the argmin and argmax operations. This SIMD optimization, along with some algorithmic tricks, proved crucial in enhancing the performance of various downsampling algorithms. We evaluate the performance of tsdownsample and demonstrate its interoperability with an established visualization framework. Our performance benchmarks indicate that the algorithmic runtime of tsdownsample approximates the CPU's memory bandwidth. This work marks a significant advancement in bringing high-performance time series downsampling to the Python ecosystem, enabling scalable visualization. The open-source code can be found at https://github.com/predict-idlab/tsdownsample. Comment: Submitted to Software
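    The core idea behind the argmin/argmax-based downsamplers the abstract mentions can be sketched in a few lines. This is an illustrative pure-Python version of a MinMax downsampler (per bin, keep the indices of the minimum and maximum sample); tsdownsample itself implements this kind of algorithm in Rust with SIMD-accelerated argmin/argmax, so the function below is a conceptual sketch, not the library's API.

    ```python
    def minmax_downsample(y, n_out):
        """Return indices of per-bin minima and maxima (MinMax downsampling).

        Illustrative sketch: splits ``y`` into ``n_out // 2`` equal bins and
        keeps the argmin and argmax of each, preserving visual extremes.
        """
        n_bins = n_out // 2                      # each bin contributes 2 indices
        bin_size = len(y) / n_bins
        idxs = []
        for b in range(n_bins):
            lo, hi = int(b * bin_size), int((b + 1) * bin_size)
            segment = range(lo, hi)
            i_min = min(segment, key=lambda i: y[i])  # argmin within the bin
            i_max = max(segment, key=lambda i: y[i])  # argmax within the bin
            idxs.extend(sorted((i_min, i_max)))       # keep temporal order
        return idxs
    ```

    Because only indices are returned, the caller can slice both the x- and y-arrays with the same result, which is also how index-returning downsamplers integrate with plotting code.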

    Plotly-Resampler: Effective Visual Analytics for Large Time Series

    Visual analytics is arguably the most important step in getting acquainted with your data. This is especially the case for time series, as this data type is hard to describe and cannot be fully understood when using, for example, summary statistics. To realize effective time series visualization, four requirements have to be met: a tool should be (1) interactive, (2) scalable to millions of data points, (3) integrable in conventional data science environments, and (4) highly configurable. We observe that open source Python visualization toolkits empower data scientists in most visual analytics tasks, but lack the combination of scalability and interactivity to realize effective time series visualization. As a means to facilitate these requirements, we created Plotly-Resampler, an open source Python library. Plotly-Resampler is an add-on for Plotly's Python bindings, enhancing line chart scalability on top of an interactive toolkit by aggregating the underlying data depending on the current graph view. Plotly-Resampler is built to be snappy, as the reactivity of a tool qualitatively affects how analysts visually explore and analyze data. A benchmark task highlights how our toolkit scales better than alternatives in terms of the number of samples and the number of time series. Additionally, Plotly-Resampler's flexible data aggregation functionality paves the way toward researching novel aggregation techniques. Plotly-Resampler's integrability, together with its configurability, convenience, and high scalability, makes it possible to effectively analyze high-frequency data in your day-to-day Python environment. Comment: The first two authors contributed equally. Accepted at IEEE VIS 202
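    The "aggregating the underlying data depending on the current graph view" idea can be sketched as a callback that reruns on every zoom or pan event: restrict the series to the visible window, and only then thin it down to a fixed point budget. This is a hedged pure-Python sketch of that pattern (a simple stride-based aggregator), not Plotly-Resampler's actual API or aggregation algorithm.

    ```python
    def aggregate_view(x, y, view_start, view_end, n_out=1000):
        """Aggregate only the currently visible slice of a time series.

        Sketch of view-dependent resampling: called on each zoom/pan event
        with the new axis range, so zooming in progressively reveals detail.
        """
        # restrict to the visible window [view_start, view_end]
        lo = next(i for i, xv in enumerate(x) if xv >= view_start)
        hi = next((i for i, xv in enumerate(x) if xv > view_end), len(x))
        xs, ys = x[lo:hi], y[lo:hi]
        if len(xs) <= n_out:              # few enough points: draw raw data
            return xs, ys
        step = len(xs) / n_out            # fixed stride keeps ~n_out points
        keep = [int(i * step) for i in range(n_out)]
        return [xs[i] for i in keep], [ys[i] for i in keep]
    ```

    In a real tool the stride-based aggregator would be swapped for a visually faithful one (e.g. MinMax or LTTB), but the control flow — re-aggregate the visible slice on every range change — is the part that delivers the scalability-plus-interactivity combination the abstract describes.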

    Do Not Sleep on Linear Models: Simple and Interpretable Techniques Outperform Deep Learning for Sleep Scoring

    Over the last few years, research in automatic sleep scoring has mainly focused on developing increasingly complex deep learning architectures. However, recently these approaches achieved only marginal improvements, often at the expense of requiring more data and more expensive training procedures. Despite all these efforts and their satisfactory performance, automatic sleep staging solutions are not widely adopted in a clinical context yet. We argue that most deep learning solutions for sleep scoring are limited in their real-world applicability as they are hard to train, deploy, and reproduce. Moreover, these solutions lack interpretability and transparency, which are often key to increasing adoption rates. In this work, we revisit the problem of sleep stage classification using classical machine learning. Results show that state-of-the-art performance can be achieved with a conventional machine learning pipeline consisting of preprocessing, feature extraction, and a simple machine learning model. In particular, we analyze the performance of a linear model and a non-linear (gradient boosting) model. Our approach surpasses the state of the art (using the same data) on two public datasets: Sleep-EDF SC-20 (MF1 0.810) and Sleep-EDF ST (MF1 0.795), while achieving competitive results on Sleep-EDF SC-78 (MF1 0.775) and MASS SS3 (MF1 0.817). We show that, for the sleep stage scoring task, the expressiveness of an engineered feature vector is on par with the internally learned representations of deep learning models. This observation opens the door to clinical adoption, as a representative feature vector allows one to leverage both the interpretability and the successful track record of traditional machine learning models. Comment: The first two authors contributed equally. Submitted to Biomedical Signal Processing and Control
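    The "engineered feature vector" stage of such a classical pipeline can be illustrated with a few hand-crafted per-epoch statistics. This is a minimal sketch only — the feature names and choices below are illustrative assumptions, not the paper's actual feature set, which is far richer — but it shows the kind of compact, interpretable representation a linear or gradient-boosting model would then consume.

    ```python
    import math

    def epoch_features(signal):
        """Compute a tiny hand-crafted feature vector for one signal epoch.

        Illustrative features: mean, standard deviation, zero-crossing rate
        (a cheap proxy for dominant frequency content), and peak-to-peak
        amplitude (which captures large slow-wave activity).
        """
        n = len(signal)
        mean = sum(signal) / n
        var = sum((v - mean) ** 2 for v in signal) / n
        # count sign changes around the mean between consecutive samples
        zc = sum(1 for a, b in zip(signal, signal[1:])
                 if (a - mean) * (b - mean) < 0)
        ptp = max(signal) - min(signal)
        return [mean, math.sqrt(var), zc / (n - 1), ptp]
    ```

    Each epoch's feature vector would be stacked into a design matrix and fed to an off-the-shelf classifier; because every column has a physical meaning, per-feature model coefficients remain directly interpretable, which is the clinical-adoption argument the abstract makes.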

    Combined searches for the production of supersymmetric top quark partners in proton-proton collisions at √s = 13 TeV

    A combination of searches for top squark pair production using proton-proton collision data at a center-of-mass energy of 13 TeV at the CERN LHC, corresponding to an integrated luminosity of 137 fb⁻¹ collected by the CMS experiment, is presented. Signatures with at least 2 jets and large missing transverse momentum are categorized into events with 0, 1, or 2 leptons. New results for regions of parameter space where the kinematical properties of top squark pair production and top quark pair production are very similar are presented. Depending on the model, the combined result excludes a top squark mass up to 1325 GeV for a massless neutralino, and a neutralino mass up to 700 GeV for a top squark mass of 1150 GeV. Top squarks with masses from 145 to 295 GeV, for neutralino masses from 0 to 100 GeV, with a mass difference between the top squark and the neutralino in a window of 30 GeV around the mass of the top quark, are excluded for the first time with CMS data. The results of these searches are also interpreted in an alternative signal model of dark matter production via a spin-0 mediator in association with a top quark pair. Upper limits are set on the cross section for mediator particle masses of up to 420 GeV.

    Search for new particles in events with energetic jets and large missing transverse momentum in proton-proton collisions at √s = 13 TeV

    A search is presented for new particles produced at the LHC in proton-proton collisions at √s = 13 TeV, using events with energetic jets and large missing transverse momentum. The analysis is based on a data sample corresponding to an integrated luminosity of 101 fb⁻¹, collected in 2017-2018 with the CMS detector. Machine learning techniques are used to define separate categories for events with narrow jets from initial-state radiation and events with large-radius jets consistent with a hadronic decay of a W or Z boson. A statistical combination is made with an earlier search based on a data sample of 36 fb⁻¹, collected in 2016. No significant excess of events is observed with respect to the standard model background expectation determined from control samples in data. The results are interpreted in terms of limits on the branching fraction of an invisible decay of the Higgs boson, as well as constraints on simplified models of dark matter, on first-generation scalar leptoquarks decaying to quarks and neutrinos, and on models with large extra dimensions. Several of the new limits, specifically for spin-1 dark matter mediators, pseudoscalar mediators, colored mediators, and leptoquarks, are the most restrictive to date. Peer reviewed.

    Probing effective field theory operators in the associated production of top quarks with a Z boson in multilepton final states at √s = 13 TeV

    Peer reviewed.

    Measurements of the Electroweak Diboson Production Cross Sections in Proton-Proton Collisions at √s = 5.02 TeV Using Leptonic Decays

    The first measurements of diboson production cross sections in proton-proton interactions at a center-of-mass energy of 5.02 TeV are reported. They are based on data collected with the CMS detector at the LHC, corresponding to an integrated luminosity of 302 pb⁻¹. Events with two, three, or four charged light leptons (electrons or muons) in the final state are analyzed. The WW, WZ, and ZZ total cross sections are measured as σ(WW) = 37.0 +5.5/−5.2 (stat) +2.7/−2.6 (syst) pb, σ(WZ) = 6.4 +2.5/−2.1 (stat) +0.5/−0.3 (syst) pb, and σ(ZZ) = 5.3 +2.5/−2.1 (stat) +0.5/−0.4 (syst) pb. All measurements are in good agreement with theoretical calculations at combined next-to-next-to-leading-order quantum chromodynamics and next-to-leading-order electroweak accuracy.

    Search for lepton-flavor violating decays of the Higgs boson in the μτ and eτ final states in proton-proton collisions at √s = 13 TeV

    A search is presented for lepton-flavor violating decays of the Higgs boson to μτ and eτ. The dataset corresponds to an integrated luminosity of 137 fb⁻¹ collected at the LHC in proton-proton collisions at a center-of-mass energy of 13 TeV. No significant excess has been found, and the results are interpreted in terms of upper limits on lepton-flavor violating branching fractions of the Higgs boson. The observed (expected) upper limit on the branching fraction B(H → eτ) is 0.22 (0.16)% at 95% confidence level. Peer reviewed.

    Measurements of Higgs boson production cross sections and couplings in the diphoton decay channel at √s = 13 TeV

    Measurements of Higgs boson production cross sections and couplings in events where the Higgs boson decays into a pair of photons are reported. Events are selected from a sample of proton-proton collisions at √s = 13 TeV collected by the CMS detector at the LHC from 2016 to 2018, corresponding to an integrated luminosity of 137 fb⁻¹. Analysis categories enriched in Higgs boson events produced via gluon fusion, vector boson fusion, vector boson associated production, and production associated with top quarks are constructed. The total Higgs boson signal strength, relative to the standard model (SM) prediction, is measured to be 1.12 +/- 0.09. Other properties of the Higgs boson are measured, including SM signal strength modifiers, production cross sections, and its couplings to other particles. These include the most precise measurements of gluon fusion and vector boson fusion Higgs boson production in several different kinematic regions, the first measurement of Higgs boson production in association with a top quark pair in five regions of the Higgs boson transverse momentum, and an upper limit on the rate of Higgs boson production in association with a single top quark. All results are found to be in agreement with the SM expectations. Peer reviewed.

    Observation of tW production in the single-lepton channel in pp collisions at √s = 13 TeV

    A measurement of the cross section of the associated production of a single top quark and a W boson in final states with a muon or electron and jets in proton-proton collisions at √s = 13 TeV is presented. The data correspond to an integrated luminosity of 36 fb⁻¹ collected with the CMS detector at the CERN LHC in 2016. A boosted decision tree is used to separate the tW signal from the dominant tt̄ background, whilst the subleading W+jets and multijet backgrounds are constrained using data-based estimates. This result is the first observation of the tW process in final states containing a muon or electron and jets, with a significance exceeding 5 standard deviations. The cross section is determined to be 89 +/- 4 (stat) +/- 12 (syst) pb, consistent with the standard model. Peer reviewed.